 experimental result


DOTA: DistributiOnal Test-time Adaptation of Vision-Language Models

Neural Information Processing Systems

However, deploying these models can be unreliable when significant distribution gaps exist between training and test data, while fine-tuning for diverse scenarios is often costly. This creates a need for methods that can efficiently adapt to new data at test time without expensive retraining. Cache-based test-time adapters serve this purpose by storing representative test samples to guide subsequent classifications. Yet, these methods typically employ naive cache management with limited capacity, leading to severe catastrophic forgetting when samples are inevitably dropped during updates. In this paper, we propose Dota (DistributiOnal Test-time Adaptation), a simple yet effective method addressing this limitation. Crucially, instead of merely memorizing individual test samples, Dota continuously estimates the underlying distribution of the test data stream. Test-time posterior probabilities are then computed using these dynamically estimated distributions via Bayes' theorem for adaptation. This distribution-centric approach enables the model to continually learn and adapt to the deployment environment. Extensive experiments validate that Dota significantly mitigates forgetting and achieves state-of-the-art performance compared to existing methods.
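The idea of estimating class distributions from a test stream and classifying with Bayes' theorem can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name a distribution family, so the per-class diagonal Gaussians, the soft-count priors, the variance floor, and every identifier below (`StreamingGaussianAdapter`, `update`, `posterior`) are assumptions chosen for illustration. The per-sample weights `probs` stand in for whatever confidence signal a real system would supply.

```python
import math


class StreamingGaussianAdapter:
    """Hypothetical sketch: per-class diagonal Gaussians updated online."""

    def __init__(self, num_classes, dim, var_floor=1e-3):
        self.num_classes = num_classes
        self.dim = dim
        self.var_floor = var_floor  # assumed floor to avoid zero variance
        self.weight = [0.0] * num_classes  # accumulated soft counts per class
        self.mean = [[0.0] * dim for _ in range(num_classes)]
        # Weighted sum of squared deviations (weighted Welford update).
        self.m2 = [[0.0] * dim for _ in range(num_classes)]

    def update(self, x, probs):
        """Fold one feature vector into each class, weighted by probs[k]."""
        for k, w in enumerate(probs):
            if w <= 0.0:
                continue
            self.weight[k] += w
            for d in range(self.dim):
                delta = x[d] - self.mean[k][d]
                self.mean[k][d] += (w / self.weight[k]) * delta
                self.m2[k][d] += w * delta * (x[d] - self.mean[k][d])

    def log_likelihood(self, x, k):
        """Log density of x under the current Gaussian estimate for class k."""
        total = 0.0
        for d in range(self.dim):
            if self.weight[k] > 0.0:
                var = max(self.m2[k][d] / self.weight[k], self.var_floor)
            else:
                var = 1.0  # no data yet: fall back to unit variance
            diff = x[d] - self.mean[k][d]
            total += -0.5 * (math.log(2.0 * math.pi * var) + diff * diff / var)
        return total

    def posterior(self, x):
        """Bayes' theorem: p(k | x) is proportional to p(x | k) * p(k)."""
        total_w = sum(self.weight)
        logits = []
        for k in range(self.num_classes):
            # Laplace-smoothed prior from the accumulated soft counts.
            prior = (self.weight[k] + 1.0) / (total_w + self.num_classes)
            logits.append(math.log(prior) + self.log_likelihood(x, k))
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]
        norm = sum(exps)
        return [e / norm for e in exps]


# Usage: stream a few features with one-hot weights, then query a new point.
adapter = StreamingGaussianAdapter(num_classes=2, dim=2)
for point in [(0.0, 0.0), (0.2, -0.2), (-0.2, 0.2)]:
    adapter.update(point, [1.0, 0.0])
for point in [(5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]:
    adapter.update(point, [0.0, 1.0])
print(adapter.posterior((0.1, 0.0)))
```

Because only running statistics are stored, memory stays constant in the length of the stream and no sample is ever evicted, which is the contrast with a fixed-capacity cache that the abstract draws.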



MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research

Hui Chen, Miao Xiong, Yujie Lu, Wei Han, Ailin Deng, Yufei He, Jiaying Wu, Yibo Li

Neural Information Processing Systems

Recent advancements in AI agents have demonstrated their growing potential to drive and support scientific discovery. In this work, we introduce MLR-Bench, a comprehensive benchmark for evaluating AI agents on open-ended machine learning research. MLR-Bench includes three key components: (1) 201 research tasks sourced from NeurIPS, ICLR, and ICML workshops covering diverse ML topics; (2) MLR-Judge, an automated evaluation framework combining LLM-based reviewers with carefully designed review rubrics to assess research quality; and (3) MLR-Agent, a modular agent scaffold capable of completing research tasks through four stages: idea generation, proposal formulation, experimentation, and paper writing. Our framework supports both stepwise assessment across these distinct research stages and end-to-end evaluation of the final research paper. We then use MLR-Bench to evaluate six frontier LLMs and an advanced coding agent, finding that while LLMs are effective at generating coherent ideas and well-structured papers, current coding agents frequently (e.g., in 80% of the cases) produce fabricated or invalidated experimental results, posing a major barrier to scientific reliability.


Hybrid-Collaborative Augmentation and Contrastive Sample Adaptive-Differential Awareness for Robust Attributed Graph Clustering

Neural Information Processing Systems

Due to its powerful capability of self-supervised representation learning and clustering, contrastive attributed graph clustering (CAGC) has achieved great success, which mainly depends on effective data augmentation and contrastive objective setting. However, most CAGC methods utilize edges as auxiliary information to obtain node-level embedding representations and focus only on node-level embedding augmentation. This approach overlooks edge-level embedding augmentation and the interactions between node-level and edge-level embedding augmentations across various granularities. Moreover, they often treat all contrastive sample pairs equally, neglecting the significant differences between hard and easy positive-negative sample pairs, which ultimately limits their discriminative capability. To tackle these issues, a novel robust attributed graph clustering (RAGC) method, incorporating hybrid-collaborative augmentation (HCA) and contrastive sample adaptive-differential awareness (CSADA), is proposed. First, node-level and edge-level embedding representations and augmentations are simultaneously executed to establish a more comprehensive similarity measurement criterion for subsequent contrastive learning.


LBMKGC: Large Model-Driven Balanced Multimodal Knowledge Graph Completion

Neural Information Processing Systems

Multi-modal Knowledge Graph Completion (MMKGC) aims to predict missing entities, relations, or attributes in knowledge graphs by collaboratively modeling the triple structure and multimodal information (e.g., text, images, videos) associated with entities.


NeuroH-TGL: Neuro-Heterogeneity Guided Temporal Graph Learning Strategy for Brain Disease Diagnosis

Neural Information Processing Systems

Dynamic functional brain networks (DFBNs) are powerful tools in neuroscience research. Recent studies reveal that DFBNs contain heterogeneous neural nodes with more extensive connections and more drastic temporal changes, which play pivotal roles in coordinating the reorganization of the brain. Moreover, the spatio-temporal patterns of these nodes are modulated by the brain's historical states. However, existing methods not only ignore the spatio-temporal heterogeneity of neural nodes, but also fail to effectively encode the temporal propagation mechanism of heterogeneous activities. These limitations hinder the deep exploration of spatio-temporal relationships within DFBNs, preventing the capture of abnormal neural heterogeneity caused by brain diseases.


Per-Architecture Training-Free Metric Optimization for Neural Architecture Search

Neural Information Processing Systems

Neural Architecture Search (NAS) aims to identify high-performance networks within a defined search space. Training-free metrics have been proposed to estimate network performance without actual training, reducing NAS deployment costs. However, individual training-free metrics often capture only partial architectural features, and their estimation capabilities vary across tasks. Combining multiple training-free metrics has been explored to enhance scalability across tasks. Yet, these methods typically optimize global metric combinations over the entire search space, overlooking the varying sensitivities of different architectures to specific metrics, which may limit the final architectures' performance. To address these challenges, we propose the Per-Architecture Training-Free Metric Optimization NAS (PO-NAS) algorithm.


7813e19a86fd73d40f7e811ab15f6d5f-Supplemental-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing Systems

Question: Do the main claims made in the abstract and introduction accurately reflect the paper's contributions and scope?
Answer: [Yes]
Justification: These claims are substantiated within the paper through detailed descriptions of the dataset's structure and the methodologies employed for each analysis task.

The answer NA means that the abstract and introduction do not include the claims made in the paper. The abstract and/or introduction should clearly state the claims made, including the contributions made in the paper and important assumptions and limitations. A No or NA answer to this question will not be perceived well by the reviewers. The claims made should match theoretical and experimental results, and reflect how much the results can be expected to generalize to other settings.


RiboFlow: Conditional De Novo RNA Co-Design via Synergistic Flow Matching

Neural Information Processing Systems

Ribonucleic acid (RNA) binds to molecules to achieve specific biological functions. While generative models are advancing biomolecule design, existing methods for designing RNA that target specific ligands face limitations in capturing RNA's conformational flexibility, ensuring structural validity, and overcoming data scarcity. To address these challenges, we introduce RiboFlow, a synergistic flow matching model to co-design RNA structures and sequences based on target molecules. By integrating RNA backbone frames, torsion angles, and sequence features in a unified architecture, RiboFlow explicitly models RNA's dynamic conformations while enforcing sequence-structure consistency to improve validity. Additionally, we curate RiboBind, a large-scale dataset of RNA-molecule interactions, to resolve the scarcity of high-quality structural data. Extensive experiments reveal that RiboFlow not only outperforms state-of-the-art RNA design methods by a large margin but also showcases controllable capabilities for achieving high binding affinity to target ligands.


Purity Law for Neural Routing Problem Solvers with Enhanced Generalizability

Neural Information Processing Systems

Achieving generalization in neural approaches across different scales and distributions remains a significant challenge for routing problems. A key obstacle is that neural networks often fail to learn robust principles for identifying universal patterns and deriving optimal solutions from diverse instances. In this paper, we first uncover Purity Law, a fundamental structural principle for optimal solutions of routing problems, defining that edge prevalence grows exponentially with the sparsity of surrounding vertices. Statistically and theoretically validated across diverse instances, Purity Law reveals a consistent bias toward local sparsity in global optima. Building on this insight, we propose Purity Policy Optimization (PUPO), a novel training paradigm that explicitly aligns characteristics of neural solutions with Purity Law during the solution construction process to enhance generalization. Extensive experiments demonstrate that PUPO can be seamlessly integrated with popular neural solvers, significantly enhancing their generalization performance without incurring additional computational overhead during inference. The code is available at https://github.com/Kejun0627/PUPO.